{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Custom statespace models\n",
    "\n",
    "The true power of the state space model is to allow the creation and estimation of custom models. This notebook shows various statespace models that subclass `sm.tsa.statespace.MLEModel`.\n",
    "\n",
    "Remember the general state space model can be written in the following general way:\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "y_t & = Z_t \\alpha_{t} + d_t +  \\varepsilon_t \\\\\n",
    "\\alpha_{t+1} & = T_t \\alpha_{t} + c_t + R_t \\eta_{t}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "You can check the details and the dimensions of the objects [in this link](https://www.statsmodels.org/stable/statespace.html#custom-state-space-models)\n",
    "\n",
    "Most models won't include all of these elements. For example, the design matrix $Z_t$ might not depend on time ($\\forall t \\;Z_t = Z$), or the model won't have an observation intercept $d_t$.\n",
    "\n",
    "We'll start with something relatively simple and then show how to extend it bit by bit to include more elements.\n",
    "\n",
    "+ Model 1: time-varying coefficients. One observation equation with two state equations\n",
    "+ Model 2: time-varying parameters with non identity transition matrix\n",
    "+ Model 3: multiple observation and multiple state equations\n",
    "+ Bonus: pymc3 for Bayesian estimation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "\n",
    "from collections import OrderedDict\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import statsmodels.api as sm\n",
    "\n",
    "plt.rc(\"figure\", figsize=(16, 8))\n",
    "plt.rc(\"font\", size=15)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model 1: time-varying coefficients\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "y_t & = d + x_t \\beta_{x,t} + w_t \\beta_{w,t} + \\varepsilon_t \\hspace{4em} \\varepsilon_t \\sim N(0, \\sigma_\\varepsilon^2)\\\\\n",
    "\\begin{bmatrix} \\beta_{x,t} \\\\ \\beta_{w,t} \\end{bmatrix} & = \\begin{bmatrix} \\beta_{x,t-1} \\\\ \\beta_{w,t-1} \\end{bmatrix} + \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\hspace{3.7em} \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\beta, x}^2 & 0 \\\\ 0 & \\sigma_{\\beta, w}^2 \\end{bmatrix} \\right )\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "The observed data is $y_t, x_t, w_t$. With $x_t, w_t$ being the exogenous variables. Notice that the design matrix is time-varying, so it will have three dimensions (`k_endog x k_states x nobs`)\n",
    "\n",
    "The states are $\\beta_{x,t}$ and $\\beta_{w,t}$. The state equation tells us these states evolve with a random walk. Thus, in this case the transition matrix is a 2 by 2 identity matrix.\n",
    "\n",
    "We'll first simulate the data, the construct a model and finally estimate it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def gen_data_for_model1():\n",
    "    nobs = 1000\n",
    "\n",
    "    rs = np.random.RandomState(seed=93572)\n",
    "\n",
    "    d = 5\n",
    "    var_y = 5\n",
    "    var_coeff_x = 0.01\n",
    "    var_coeff_w = 0.5\n",
    "\n",
    "    x_t = rs.uniform(size=nobs)\n",
    "    w_t = rs.uniform(size=nobs)\n",
    "    eps = rs.normal(scale=var_y**0.5, size=nobs)\n",
    "\n",
    "    beta_x = np.cumsum(rs.normal(size=nobs, scale=var_coeff_x**0.5))\n",
    "    beta_w = np.cumsum(rs.normal(size=nobs, scale=var_coeff_w**0.5))\n",
    "\n",
    "    y_t = d + beta_x * x_t + beta_w * w_t + eps\n",
    "    return y_t, x_t, w_t, beta_x, beta_w\n",
    "\n",
    "\n",
    "y_t, x_t, w_t, beta_x, beta_w = gen_data_for_model1()\n",
    "_ = plt.plot(y_t)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class TVRegression(sm.tsa.statespace.MLEModel):\n",
    "    def __init__(self, y_t, x_t, w_t):\n",
    "        exog = np.c_[x_t, w_t]  # shaped nobs x 2\n",
    "\n",
    "        super(TVRegression, self).__init__(\n",
    "            endog=y_t, exog=exog, k_states=2, initialization=\"diffuse\"\n",
    "        )\n",
    "\n",
    "        # Since the design matrix is time-varying, it must be\n",
    "        # shaped k_endog x k_states x nobs\n",
    "        # Notice that exog.T is shaped k_states x nobs, so we\n",
    "        # just need to add a new first axis with shape 1\n",
    "        self.ssm[\"design\"] = exog.T[np.newaxis, :, :]  # shaped 1 x 2 x nobs\n",
    "        self.ssm[\"selection\"] = np.eye(self.k_states)\n",
    "        self.ssm[\"transition\"] = np.eye(self.k_states)\n",
    "\n",
    "        # Which parameters need to be positive?\n",
    "        self.positive_parameters = slice(1, 4)\n",
    "\n",
    "    @property\n",
    "    def param_names(self):\n",
    "        return [\"intercept\", \"var.e\", \"var.x.coeff\", \"var.w.coeff\"]\n",
    "\n",
    "    @property\n",
    "    def start_params(self):\n",
    "        \"\"\"\n",
    "        Defines the starting values for the parameters\n",
    "        The linear regression gives us reasonable starting values for the constant\n",
    "        d and the variance of the epsilon error\n",
    "        \"\"\"\n",
    "        exog = sm.add_constant(self.exog)\n",
    "        res = sm.OLS(self.endog, exog).fit()\n",
    "        params = np.r_[res.params[0], res.scale, 0.001, 0.001]\n",
    "        return params\n",
    "\n",
    "    def transform_params(self, unconstrained):\n",
    "        \"\"\"\n",
    "        We constraint the last three parameters\n",
    "        ('var.e', 'var.x.coeff', 'var.w.coeff') to be positive,\n",
    "        because they are variances\n",
    "        \"\"\"\n",
    "        constrained = unconstrained.copy()\n",
    "        constrained[self.positive_parameters] = (\n",
    "            constrained[self.positive_parameters] ** 2\n",
    "        )\n",
    "        return constrained\n",
    "\n",
    "    def untransform_params(self, constrained):\n",
    "        \"\"\"\n",
    "        Need to unstransform all the parameters you transformed\n",
    "        in the `transform_params` function\n",
    "        \"\"\"\n",
    "        unconstrained = constrained.copy()\n",
    "        unconstrained[self.positive_parameters] = (\n",
    "            unconstrained[self.positive_parameters] ** 0.5\n",
    "        )\n",
    "        return unconstrained\n",
    "\n",
    "    def update(self, params, **kwargs):\n",
    "        params = super(TVRegression, self).update(params, **kwargs)\n",
    "\n",
    "        self[\"obs_intercept\", 0, 0] = params[0]\n",
    "        self[\"obs_cov\", 0, 0] = params[1]\n",
    "        self[\"state_cov\"] = np.diag(params[2:4])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### And then estimate it with our custom model class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mod = TVRegression(y_t, x_t, w_t)\n",
    "res = mod.fit()\n",
    "\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The values that generated the data were:\n",
    "\n",
    "+ intercept = 5\n",
    "+ var.e = 5\n",
    "+ var.x.coeff = 0.01\n",
    "+ var.w.coeff = 0.5\n",
    "\n",
    "\n",
    "As you can see, the estimation recovered the real parameters pretty well.\n",
    "\n",
    "We can also recover the estimated evolution of the underlying coefficients (or states in Kalman filter talk)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, axes = plt.subplots(2, figsize=(16, 8))\n",
    "\n",
    "ss = pd.DataFrame(res.smoothed_state.T, columns=[\"x\", \"w\"])\n",
    "\n",
    "axes[0].plot(beta_x, label=\"True\")\n",
    "axes[0].plot(ss[\"x\"], label=\"Smoothed estimate\")\n",
    "axes[0].set(title=\"Time-varying coefficient on x_t\")\n",
    "axes[0].legend()\n",
    "\n",
    "axes[1].plot(beta_w, label=\"True\")\n",
    "axes[1].plot(ss[\"w\"], label=\"Smoothed estimate\")\n",
    "axes[1].set(title=\"Time-varying coefficient on w_t\")\n",
    "axes[1].legend()\n",
    "\n",
    "fig.tight_layout();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model 2: time-varying parameters with non identity transition matrix\n",
    "\n",
    "This is a small extension from Model 1. Instead of having an identity transition matrix, we'll have one with two parameters ($\\rho_1, \\rho_2$) that we need to estimate. \n",
    "\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "y_t & = d + x_t \\beta_{x,t} + w_t \\beta_{w,t} + \\varepsilon_t \\hspace{4em} \\varepsilon_t \\sim N(0, \\sigma_\\varepsilon^2)\\\\\n",
    "\\begin{bmatrix} \\beta_{x,t} \\\\ \\beta_{w,t} \\end{bmatrix} & = \\begin{bmatrix} \\rho_1 & 0 \\\\ 0 & \\rho_2 \\end{bmatrix} \\begin{bmatrix} \\beta_{x,t-1} \\\\ \\beta_{w,t-1} \\end{bmatrix} + \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\hspace{3.7em} \\begin{bmatrix} \\zeta_{x,t} \\\\ \\zeta_{w,t} \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\beta, x}^2 & 0 \\\\ 0 & \\sigma_{\\beta, w}^2 \\end{bmatrix} \\right )\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "\n",
    "What should we modify in our previous class to make things work?\n",
    "+ Good news: not a lot!\n",
    "+ Bad news: we need to be careful about a few things"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1) Change the starting parameters function\n",
    "\n",
    "We need to add names for the new parameters $\\rho_1, \\rho_2$ and we need to start corresponding starting values.\n",
    "\n",
    "The `param_names` function goes from:\n",
    "\n",
    "```python\n",
    "def param_names(self):\n",
    "    return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff']\n",
    "```\n",
    "\n",
    "\n",
    "to \n",
    "\n",
    "```python\n",
    "def param_names(self):\n",
    "    return ['intercept', 'var.e', 'var.x.coeff', 'var.w.coeff',\n",
    "           'rho1', 'rho2']\n",
    "```\n",
    "\n",
    "and we change the `start_params` function from \n",
    "\n",
    "```python\n",
    "def start_params(self):\n",
    "    exog = sm.add_constant(self.exog)\n",
    "    res = sm.OLS(self.endog, exog).fit()\n",
    "    params = np.r_[res.params[0], res.scale, 0.001, 0.001]\n",
    "    return params\n",
    "```\n",
    "\n",
    "to\n",
    "\n",
    "```python\n",
    "def start_params(self):\n",
    "    exog = sm.add_constant(self.exog)\n",
    "    res = sm.OLS(self.endog, exog).fit()\n",
    "    params = np.r_[res.params[0], res.scale, 0.001, 0.001, 0.8, 0.8]\n",
    "    return params\n",
    "```\n",
    "\n",
    "2) Change the `update` function\n",
    "\n",
    "It goes from\n",
    "\n",
    "```python\n",
    "def update(self, params, **kwargs):\n",
    "    params = super(TVRegression, self).update(params, **kwargs)\n",
    "\n",
    "    self['obs_intercept', 0, 0] = params[0]\n",
    "    self['obs_cov', 0, 0] = params[1]\n",
    "    self['state_cov'] = np.diag(params[2:4])\n",
    "```\n",
    "\n",
    "\n",
    "to \n",
    "\n",
    "```python\n",
    "def update(self, params, **kwargs):\n",
    "    params = super(TVRegression, self).update(params, **kwargs)\n",
    "\n",
    "    self['obs_intercept', 0, 0] = params[0]\n",
    "    self['obs_cov', 0, 0] = params[1]\n",
    "    self['state_cov'] = np.diag(params[2:4])\n",
    "    self['transition', 0, 0] = params[4]\n",
    "    self['transition', 1, 1] = params[5]\n",
    "```\n",
    "\n",
    "\n",
    "3) (optional) change `transform_params` and `untransform_params`\n",
    "\n",
    "This is not required, but you might wanna restrict $\\rho_1, \\rho_2$ to lie between -1 and 1.\n",
    "In that case, we first import two utility functions from `statsmodels`.\n",
    "\n",
    "\n",
    "```python\n",
    "from statsmodels.tsa.statespace.tools import (\n",
    "    constrain_stationary_univariate, unconstrain_stationary_univariate)\n",
    "```\n",
    "\n",
    "`constrain_stationary_univariate` constraint the value to be within -1 and 1. \n",
    "`unconstrain_stationary_univariate` provides the inverse function.\n",
    "The transform and untransform parameters function would look like this\n",
    "(remember that $\\rho_1, \\rho_2$ are in the 4 and 5th index):\n",
    "\n",
    "```python\n",
    "def transform_params(self, unconstrained):\n",
    "    constrained = unconstrained.copy()\n",
    "    constrained[self.positive_parameters] = constrained[self.positive_parameters]**2\n",
    "    constrained[4] = constrain_stationary_univariate(constrained[4:5])\n",
    "    constrained[5] = constrain_stationary_univariate(constrained[5:6])\n",
    "    return constrained\n",
    "\n",
    "def untransform_params(self, constrained):\n",
    "    unconstrained = constrained.copy()\n",
    "    unconstrained[self.positive_parameters] = unconstrained[self.positive_parameters]**0.5\n",
    "    unconstrained[4] = unconstrain_stationary_univariate(constrained[4:5])\n",
    "    unconstrained[5] = unconstrain_stationary_univariate(constrained[5:6])\n",
    "    return unconstrained\n",
    "```\n",
    "\n",
    "I'll write the full class below (without the optional changes I have just discussed)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class TVRegressionExtended(sm.tsa.statespace.MLEModel):\n",
    "    def __init__(self, y_t, x_t, w_t):\n",
    "        exog = np.c_[x_t, w_t]  # shaped nobs x 2\n",
    "\n",
    "        super(TVRegressionExtended, self).__init__(\n",
    "            endog=y_t, exog=exog, k_states=2, initialization=\"diffuse\"\n",
    "        )\n",
    "\n",
    "        # Since the design matrix is time-varying, it must be\n",
    "        # shaped k_endog x k_states x nobs\n",
    "        # Notice that exog.T is shaped k_states x nobs, so we\n",
    "        # just need to add a new first axis with shape 1\n",
    "        self.ssm[\"design\"] = exog.T[np.newaxis, :, :]  # shaped 1 x 2 x nobs\n",
    "        self.ssm[\"selection\"] = np.eye(self.k_states)\n",
    "        self.ssm[\"transition\"] = np.eye(self.k_states)\n",
    "\n",
    "        # Which parameters need to be positive?\n",
    "        self.positive_parameters = slice(1, 4)\n",
    "\n",
    "    @property\n",
    "    def param_names(self):\n",
    "        return [\"intercept\", \"var.e\", \"var.x.coeff\", \"var.w.coeff\", \"rho1\", \"rho2\"]\n",
    "\n",
    "    @property\n",
    "    def start_params(self):\n",
    "        \"\"\"\n",
    "        Defines the starting values for the parameters\n",
    "        The linear regression gives us reasonable starting values for the constant\n",
    "        d and the variance of the epsilon error\n",
    "        \"\"\"\n",
    "\n",
    "        exog = sm.add_constant(self.exog)\n",
    "        res = sm.OLS(self.endog, exog).fit()\n",
    "        params = np.r_[res.params[0], res.scale, 0.001, 0.001, 0.7, 0.8]\n",
    "        return params\n",
    "\n",
    "    def transform_params(self, unconstrained):\n",
    "        \"\"\"\n",
    "        We constraint the last three parameters\n",
    "        ('var.e', 'var.x.coeff', 'var.w.coeff') to be positive,\n",
    "        because they are variances\n",
    "        \"\"\"\n",
    "        constrained = unconstrained.copy()\n",
    "        constrained[self.positive_parameters] = (\n",
    "            constrained[self.positive_parameters] ** 2\n",
    "        )\n",
    "        return constrained\n",
    "\n",
    "    def untransform_params(self, constrained):\n",
    "        \"\"\"\n",
    "        Need to unstransform all the parameters you transformed\n",
    "        in the `transform_params` function\n",
    "        \"\"\"\n",
    "        unconstrained = constrained.copy()\n",
    "        unconstrained[self.positive_parameters] = (\n",
    "            unconstrained[self.positive_parameters] ** 0.5\n",
    "        )\n",
    "        return unconstrained\n",
    "\n",
    "    def update(self, params, **kwargs):\n",
    "        params = super(TVRegressionExtended, self).update(params, **kwargs)\n",
    "\n",
    "        self[\"obs_intercept\", 0, 0] = params[0]\n",
    "        self[\"obs_cov\", 0, 0] = params[1]\n",
    "        self[\"state_cov\"] = np.diag(params[2:4])\n",
    "        self[\"transition\", 0, 0] = params[4]\n",
    "        self[\"transition\", 1, 1] = params[5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To estimate, we'll use the same data as in model 1 and expect the $\\rho_1, \\rho_2$ to be near 1.\n",
    "\n",
    "The results look pretty good!\n",
    "Note that this estimation can be quite sensitive to the starting value of $\\rho_1, \\rho_2$. If you try lower values, you'll see it fails to converge."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mod = TVRegressionExtended(y_t, x_t, w_t)\n",
    "res = mod.fit(maxiter=2000)  # it doesn't converge with 50 iters\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model 3: multiple observation and state equations\n",
    "\n",
    "We'll keep the time-varying parameters, but this time we'll also have two observation equations.\n",
    "\n",
    "### Observation equations\n",
    "\n",
    "$\\hat{i_t}, \\hat{M_t},  \\hat{s_t}$ are observed each period.\n",
    "\n",
    "The model for the observation equation has two equations:\n",
    "\n",
    "$$ \\hat{i_t} = \\alpha_1 * \\hat{s_t} + \\varepsilon_1 $$\n",
    "\n",
    "$$ \\hat{M_t} = \\alpha_2 + \\varepsilon_2 $$\n",
    "\n",
    "Following the [general notation from state space models](https://www.statsmodels.org/stable/statespace.html), the endogenous part of the observation equation is $y_t = (\\hat{i_t},  \\hat{M_t})$ and we only have one exogenous variable $\\hat{s_t}$\n",
    "\n",
    "\n",
    "### State equations\n",
    "\n",
    "\n",
    "$$ \\alpha_{1, t+1} = \\delta_1 \\alpha_{1, t} + \\delta_2 \\alpha_{2, t} + W_1 $$\n",
    "\n",
    "$$ \\alpha_{2, t+1} = \\delta_3  \\alpha_{2, t} + W_2 $$\n",
    "\n",
    "\n",
    "### Matrix notation for the state space model\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\begin{bmatrix} \\hat{i_t} \\\\ \\hat{M_t} \\end{bmatrix} &= \n",
    "\\begin{bmatrix} \\hat{s_t}  & 0 \\\\ 0 & 1 \\end{bmatrix} \\begin{bmatrix} \\alpha_{1, t} \\\\ \\alpha_{2, t} \\end{bmatrix} + \\begin{bmatrix} \\varepsilon_{1, t} \\\\ \\varepsilon_{1, t} \\end{bmatrix}  \\hspace{6.5em} \\varepsilon_t \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{\\varepsilon_1}^2 & 0 \\\\ 0 & \\sigma_{\\varepsilon_2}^2  \\end{bmatrix} \\right )\n",
    "\\\\\n",
    "\\begin{bmatrix} \\alpha_{1, t+1} \\\\ \\alpha_{2, t+1} \\end{bmatrix} & = \\begin{bmatrix} \\delta_1 & \\delta_1 \\\\ 0 & \\delta_3 \\end{bmatrix} \\begin{bmatrix} \\alpha_{1, t} \\\\ \\alpha_{2, t} \\end{bmatrix} + \\begin{bmatrix} W_1 \\\\ W_2 \\end{bmatrix} \\hspace{3.em} \\begin{bmatrix} W_1 \\\\ W_2 \\end{bmatrix} \\sim N \\left ( \\begin{bmatrix} 0 \\\\ 0 \\end{bmatrix}, \\begin{bmatrix} \\sigma_{W_1}^2 & 0 \\\\ 0 & \\sigma_{W_2}^2 \\end{bmatrix} \\right )\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "I'll simulate some data, talk about what we need to modify and finally estimate the model to see if we're recovering something reasonable.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "true_values = {\n",
    "    \"var_e1\": 0.01,\n",
    "    \"var_e2\": 0.01,\n",
    "    \"var_w1\": 0.01,\n",
    "    \"var_w2\": 0.01,\n",
    "    \"delta1\": 0.8,\n",
    "    \"delta2\": 0.5,\n",
    "    \"delta3\": 0.7,\n",
    "}\n",
    "\n",
    "\n",
    "def gen_data_for_model3():\n",
    "    # Starting values\n",
    "    alpha1_0 = 2.1\n",
    "    alpha2_0 = 1.1\n",
    "\n",
    "    t_max = 500\n",
    "\n",
    "    def gen_i(alpha1, s):\n",
    "        return alpha1 * s + np.sqrt(true_values[\"var_e1\"]) * np.random.randn()\n",
    "\n",
    "    def gen_m_hat(alpha2):\n",
    "        return 1 * alpha2 + np.sqrt(true_values[\"var_e2\"]) * np.random.randn()\n",
    "\n",
    "    def gen_alpha1(alpha1, alpha2):\n",
    "        w1 = np.sqrt(true_values[\"var_w1\"]) * np.random.randn()\n",
    "        return true_values[\"delta1\"] * alpha1 + true_values[\"delta2\"] * alpha2 + w1\n",
    "\n",
    "    def gen_alpha2(alpha2):\n",
    "        w2 = np.sqrt(true_values[\"var_w2\"]) * np.random.randn()\n",
    "        return true_values[\"delta3\"] * alpha2 + w2\n",
    "\n",
    "    s_t = 0.3 + np.sqrt(1.4) * np.random.randn(t_max)\n",
    "    i_hat = np.empty(t_max)\n",
    "    m_hat = np.empty(t_max)\n",
    "\n",
    "    current_alpha1 = alpha1_0\n",
    "    current_alpha2 = alpha2_0\n",
    "    for t in range(t_max):\n",
    "        # Obs eqns\n",
    "        i_hat[t] = gen_i(current_alpha1, s_t[t])\n",
    "        m_hat[t] = gen_m_hat(current_alpha2)\n",
    "\n",
    "        # state eqns\n",
    "        new_alpha1 = gen_alpha1(current_alpha1, current_alpha2)\n",
    "        new_alpha2 = gen_alpha2(current_alpha2)\n",
    "\n",
    "        # Update states for next period\n",
    "        current_alpha1 = new_alpha1\n",
    "        current_alpha2 = new_alpha2\n",
    "\n",
    "    return i_hat, m_hat, s_t\n",
    "\n",
    "\n",
    "i_hat, m_hat, s_t = gen_data_for_model3()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What do we need to modify?\n",
    "\n",
    "Once again, we don't need to change much, but we need to be careful about the dimensions.\n",
    "\n",
    "#### 1) The `__init__` function changes from\n",
    "\n",
    "\n",
    "```python\n",
    "def __init__(self, y_t, x_t, w_t):\n",
    "        exog = np.c_[x_t, w_t]\n",
    "\n",
    "        super(TVRegressionExtended, self).__init__(\n",
    "            endog=y_t, exog=exog, k_states=2,\n",
    "            initialization='diffuse')\n",
    "\n",
    "        self.ssm['design'] = exog.T[np.newaxis, :, :]  # shaped 1 x 2 x nobs\n",
    "        self.ssm['selection'] = np.eye(self.k_states)\n",
    "        self.ssm['transition'] = np.eye(self.k_states)\n",
    "```\n",
    "\n",
    "to\n",
    "\n",
    "\n",
    "```python\n",
    "def __init__(self, i_t: np.array, s_t: np.array, m_t: np.array):\n",
    "\n",
    "        exog = np.c_[s_t, np.repeat(1, len(s_t))]  # exog.shape => (nobs, 2)\n",
    "        \n",
    "        super(MultipleYsModel, self).__init__(\n",
    "            endog=np.c_[i_t, m_t], exog=exog, k_states=2,\n",
    "            initialization='diffuse')\n",
    "        \n",
    "        self.ssm['design'] = np.zeros((self.k_endog, self.k_states, self.nobs))\n",
    "        self.ssm['design', 0, 0, :] = s_t\n",
    "        self.ssm['design', 1, 1, :] = 1\n",
    "```\n",
    "\n",
    "Note that we did not have to specify `k_endog` anywhere. The initialization does this for us after checking the dimensions of the `endog` matrix.\n",
    "\n",
    "\n",
    "#### 2) The `update()` function \n",
    "\n",
    "changes from \n",
    "\n",
    "```python\n",
    "def update(self, params, **kwargs):\n",
    "    params = super(TVRegressionExtended, self).update(params, **kwargs)\n",
    "\n",
    "    self['obs_intercept', 0, 0] = params[0]\n",
    "    self['obs_cov', 0, 0] = params[1]\n",
    "    \n",
    "    self['state_cov'] = np.diag(params[2:4])\n",
    "    self['transition', 0, 0] = params[4]\n",
    "    self['transition', 1, 1] = params[5]\n",
    "```\n",
    "\n",
    "\n",
    "to\n",
    "\n",
    "\n",
    "```python\n",
    "def update(self, params, **kwargs):\n",
    "    params = super(MultipleYsModel, self).update(params, **kwargs)\n",
    "    \n",
    "    \n",
    "    #The following line is not needed (by default, this matrix is initialized by zeroes),\n",
    "    #But I leave it here so the dimensions are clearer\n",
    "    self['obs_intercept'] = np.repeat([np.array([0, 0])], self.nobs, axis=0).T\n",
    "    self['obs_cov', 0, 0] = params[0]\n",
    "    self['obs_cov', 1, 1] = params[1]\n",
    "\n",
    "    self['state_cov'] = np.diag(params[2:4])\n",
    "    #delta1, delta2, delta3\n",
    "    self['transition', 0, 0] = params[4]\n",
    "    self['transition', 0, 1] = params[5]\n",
    "    self['transition', 1, 1] = params[6]\n",
    "```\n",
    "\n",
    "The rest of the methods change in pretty obvious ways (need to add parameter names, make sure the indexes work, etc). The full code for the function is right below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "starting_values = {\n",
    "    \"var_e1\": 0.2,\n",
    "    \"var_e2\": 0.1,\n",
    "    \"var_w1\": 0.15,\n",
    "    \"var_w2\": 0.18,\n",
    "    \"delta1\": 0.7,\n",
    "    \"delta2\": 0.1,\n",
    "    \"delta3\": 0.85,\n",
    "}\n",
    "\n",
    "\n",
    "class MultipleYsModel(sm.tsa.statespace.MLEModel):\n",
    "    def __init__(self, i_t: np.array, s_t: np.array, m_t: np.array):\n",
    "\n",
    "        exog = np.c_[s_t, np.repeat(1, len(s_t))]  # exog.shape => (nobs, 2)\n",
    "\n",
    "        super(MultipleYsModel, self).__init__(\n",
    "            endog=np.c_[i_t, m_t], exog=exog, k_states=2, initialization=\"diffuse\"\n",
    "        )\n",
    "\n",
    "        self.ssm[\"design\"] = np.zeros((self.k_endog, self.k_states, self.nobs))\n",
    "        self.ssm[\"design\", 0, 0, :] = s_t\n",
    "        self.ssm[\"design\", 1, 1, :] = 1\n",
    "\n",
    "        # These have ok shape. Placeholders since I'm changing them\n",
    "        # in the update() function\n",
    "        self.ssm[\"selection\"] = np.eye(self.k_states)\n",
    "        self.ssm[\"transition\"] = np.eye(self.k_states)\n",
    "\n",
    "        # Dictionary of positions to names\n",
    "        self.position_dict = OrderedDict(\n",
    "            var_e1=1, var_e2=2, var_w1=3, var_w2=4, delta1=5, delta2=6, delta3=7\n",
    "        )\n",
    "        self.initial_values = starting_values\n",
    "        self.positive_parameters = slice(0, 4)\n",
    "\n",
    "    @property\n",
    "    def param_names(self):\n",
    "        return list(self.position_dict.keys())\n",
    "\n",
    "    @property\n",
    "    def start_params(self):\n",
    "        \"\"\"\n",
    "        Initial values\n",
    "        \"\"\"\n",
    "        # (optional) Use scale for var_e1 and var_e2 starting values\n",
    "        params = np.r_[\n",
    "            self.initial_values[\"var_e1\"],\n",
    "            self.initial_values[\"var_e2\"],\n",
    "            self.initial_values[\"var_w1\"],\n",
    "            self.initial_values[\"var_w2\"],\n",
    "            self.initial_values[\"delta1\"],\n",
    "            self.initial_values[\"delta2\"],\n",
    "            self.initial_values[\"delta3\"],\n",
    "        ]\n",
    "        return params\n",
    "\n",
    "    def transform_params(self, unconstrained):\n",
    "        \"\"\"\n",
    "        If you need to restrict parameters\n",
    "        For example, variances should be > 0\n",
    "        Parameters maybe have to be within -1 and 1\n",
    "        \"\"\"\n",
    "        constrained = unconstrained.copy()\n",
    "        constrained[self.positive_parameters] = (\n",
    "            constrained[self.positive_parameters] ** 2\n",
    "        )\n",
    "        return constrained\n",
    "\n",
    "    def untransform_params(self, constrained):\n",
    "        \"\"\"\n",
    "        Need to reverse what you did in transform_params()\n",
    "        \"\"\"\n",
    "        unconstrained = constrained.copy()\n",
    "        unconstrained[self.positive_parameters] = (\n",
    "            unconstrained[self.positive_parameters] ** 0.5\n",
    "        )\n",
    "        return unconstrained\n",
    "\n",
    "    def update(self, params, **kwargs):\n",
    "        params = super(MultipleYsModel, self).update(params, **kwargs)\n",
    "\n",
    "        # The following line is not needed (by default, this matrix is initialized by zeroes),\n",
    "        # But I leave it here so the dimensions are clearer\n",
    "        self[\"obs_intercept\"] = np.repeat([np.array([0, 0])], self.nobs, axis=0).T\n",
    "\n",
    "        self[\"obs_cov\", 0, 0] = params[0]\n",
    "        self[\"obs_cov\", 1, 1] = params[1]\n",
    "\n",
    "        self[\"state_cov\"] = np.diag(params[2:4])\n",
    "\n",
    "        # delta1, delta2, delta3\n",
    "        self[\"transition\", 0, 0] = params[4]\n",
    "        self[\"transition\", 0, 1] = params[5]\n",
    "        self[\"transition\", 1, 1] = params[6]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mod = MultipleYsModel(i_hat, s_t, m_hat)\n",
    "res = mod.fit()\n",
    "\n",
    "print(res.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bonus: pymc3 for fast Bayesian estimation\n",
    "\n",
    "In this section I'll show how you can take your custom state space model and easily plug it to `pymc3` and estimate it with Bayesian methods. In particular, this example will show you an estimation with a version of Hamiltonian Monte Carlo called the No-U-Turn Sampler (NUTS).\n",
    "\n",
    "I'm basically copying the ideas contained [in this notebook](https://www.statsmodels.org/dev/examples/notebooks/generated/statespace_sarimax_pymc3.html), so make sure to check that for more details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Extra requirements\n",
    "import pymc3 as pm\n",
    "import theano\n",
    "import theano.tensor as tt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We need to define some helper functions to connect theano to the likelihood function that is implied in our model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Loglike(tt.Op):\n",
    "\n",
    "    itypes = [tt.dvector]  # expects a vector of parameter values when called\n",
    "    otypes = [tt.dscalar]  # outputs a single scalar value (the log likelihood)\n",
    "\n",
    "    def __init__(self, model):\n",
    "        self.model = model\n",
    "        self.score = Score(self.model)\n",
    "\n",
    "    def perform(self, node, inputs, outputs):\n",
    "        (theta,) = inputs  # contains the vector of parameters\n",
    "        llf = self.model.loglike(theta)\n",
    "        outputs[0][0] = np.array(llf)  # output the log-likelihood\n",
    "\n",
    "    def grad(self, inputs, g):\n",
    "        # the method that calculates the gradients - it actually returns the\n",
    "        # vector-Jacobian product - g[0] is a vector of parameter values\n",
    "        (theta,) = inputs  # our parameters\n",
    "        out = [g[0] * self.score(theta)]\n",
    "        return out\n",
    "\n",
    "\n",
    "class Score(tt.Op):\n",
    "    itypes = [tt.dvector]\n",
    "    otypes = [tt.dvector]\n",
    "\n",
    "    def __init__(self, model):\n",
    "        self.model = model\n",
    "\n",
    "    def perform(self, node, inputs, outputs):\n",
    "        (theta,) = inputs\n",
    "        outputs[0][0] = self.model.score(theta)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll simulate again the data we used for model 1.\n",
    "We'll also `fit` it again and save the results to compare them to the Bayesian posterior we get."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_t, x_t, w_t, beta_x, beta_w = gen_data_for_model1()\n",
    "plt.plot(y_t)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mod = TVRegression(y_t, x_t, w_t)\n",
    "res_mle = mod.fit(disp=False)\n",
    "print(res_mle.summary())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Bayesian estimation\n",
    "\n",
    "We need to define a prior for each parameter and the number of draws and burn-in points"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set sampling params\n",
    "ndraws = 3000  #  3000 number of draws from the distribution\n",
    "nburn = 600  # 600 number of \"burn-in points\" (which will be discarded)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Construct an instance of the Theano wrapper defined above, which\n",
    "# will allow PyMC3 to compute the likelihood and Jacobian in a way\n",
    "# that it can make use of. Here we are using the same model instance\n",
    "# created earlier for MLE analysis (we could also create a new model\n",
    "# instance if we preferred)\n",
    "loglike = Loglike(mod)\n",
    "\n",
    "with pm.Model():\n",
    "    # Priors\n",
    "    intercept = pm.Uniform(\"intercept\", 1, 10)\n",
    "    var_e = pm.InverseGamma(\"var.e\", 2.3, 0.5)\n",
    "    var_x_coeff = pm.InverseGamma(\"var.x.coeff\", 2.3, 0.1)\n",
    "    var_w_coeff = pm.InverseGamma(\"var.w.coeff\", 2.3, 0.1)\n",
    "\n",
    "    # convert variables to tensor vectors\n",
    "    theta = tt.as_tensor_variable([intercept, var_e, var_x_coeff, var_w_coeff])\n",
    "\n",
    "    # use a DensityDist (use a lamdba function to \"call\" the Op)\n",
    "    pm.DensityDist(\"likelihood\", loglike, observed=theta)\n",
    "\n",
    "    # Draw samples\n",
    "    trace = pm.sample(\n",
    "        ndraws,\n",
    "        tune=nburn,\n",
    "        return_inferencedata=True,\n",
    "        cores=1,\n",
    "        compute_convergence_checks=False,\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### How does the posterior distribution compare with the MLE estimation?\n",
    "\n",
    "The clearly peak around the MLE estimate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "results_dict = {\n",
    "    \"intercept\": res_mle.params[0],\n",
    "    \"var.e\": res_mle.params[1],\n",
    "    \"var.x.coeff\": res_mle.params[2],\n",
    "    \"var.w.coeff\": res_mle.params[3],\n",
    "}\n",
    "plt.tight_layout()\n",
    "_ = pm.plot_trace(\n",
    "    trace,\n",
    "    lines=[(k, {}, [v]) for k, v in dict(results_dict).items()],\n",
    "    combined=True,\n",
    "    figsize=(12, 12),\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
